refactor(integration_tests): unify bridge models and clear dbt deprecations by rabee05 · Pull Request #1325 · tuva-health/tuva

rabee05 · 2026-04-24T03:31:55Z

Problem

Every bridge model had two SELECTs gated by use_synthetic_data, with duplicated column lists that drifted apart.
No DAG edge from raw_data__* seeds to bridge models, so dbt build sometimes ran models before seeds and failed with Dataset raw_data not found.
Mixed model layouts (inline vs tuva_columns/tuva_extensions/tuva_metadata) and ~500 copies of the same cross-adapter type jinja across seed yml files.
dbt parse surfaced three deprecations and one BigQuery error (cast(… as varchar)).

Fix

tuva_source() macro — returns ref('raw_data__<table>') in synthetic mode, source('source_input', <table>) otherwise. Compile-time ref() registers the seed → model DAG edge.
Unified 15 bridge models on the tuva_columns / tuva_extensions / tuva_metadata layout.
Restored models/_sources.yml with optional input_database / input_schema (fall back to target.database / target.schema).
YAML anchors (*string, *datetime, *float) in both seed yml files.
Gated raw_data__* and synthetic_data__* seeds on use_synthetic_data.
Seed naming: patient_seed.csv → raw_data__patient.csv; added header-only CSVs for 6 missing clinical tables.
Cleared deprecations: +batch_size → +meta.batch_size; combination_of_columns nested under arguments:; varchar → {{ dbt.type_string() }} in extension tests.

How to test

Run dbt build --full-refresh with use_synthetic_data: true — seeds run before bridge models, extension-column tests pass on BigQuery.
Run dbt build --full-refresh with use_synthetic_data: false and input_database / input_schema pointed at a real input layer — bridge models read from source_input, no synthetic seeds materialize.

Breaking changes

None. Tuva package contract unchanged.

Author: SnowQuery — Healthcare Data Engineering & Architecture Consulting

…thetic mode

* Rename patient_seed.csv to raw_data__patient.csv so every clinical input seed follows the raw_data__<table> convention that tuva_source() resolves to.
* Add empty header-only CSVs for condition, encounter, location, medication, practitioner, procedure, so each clinical table has a seed relation in synthetic mode even when no synthetic data exists.
* Replace repeated per-column cross-adapter jinja with YAML anchors (*string, *datetime, *float) in seeds/_seeds.yml, dropping ~640 lines of duplication.
* Gate every raw_data__* seed on use_synthetic_data so non-synthetic runs no longer materialize unused seed tables.

…sion-column checks

BigQuery rejects "cast(... as varchar)"; use the cross-adapter type macro so the check_extension_columns_in_core_{eligibility,medical_claim,member_months,pharmacy_claim} tests compile on every warehouse.

Collapse the repeated per-column cross-adapter jinja (bigquery/databricks string, athena/databricks real, fabric datetime2, etc.) into three shared anchors — *string, *datetime, *float — and reference them from every column_types entry. Also gate each synthetic_data__* seed on use_synthetic_data so non-synthetic runs stop materializing seeds nothing refs. Drops ~260 lines of repetition from seeds/synthetic_data/synthetic_data_seeds.yml without changing runtime type resolution.

… at compile time

Returns ref('raw_data__<table>') when use_synthetic_data is true, or source('source_input', <table>) otherwise. Because ref() is evaluated at parse time, dbt wires the seed as an upstream dependency of every bridge model that calls tuva_source(), so seeds run before their dependent models without an on-run-start hook or explicit ordering. Returns the Relation object (not a rendered string), so callers can use either "from {{ tuva_source('X') }}" or bind it with {% set r = tuva_source('X') %} for adapter.get_columns_in_relation(r).
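The ordering argument above can be illustrated outside dbt with a toy Python model. This is only a sketch of the idea, not the macro itself: the node-name strings, the `build_order` helper, and the two example bridge models are hypothetical, and the standard-library `graphlib` stands in for dbt's own DAG scheduler.

```python
from graphlib import TopologicalSorter


def tuva_source(table, use_synthetic_data, deps):
    """Toy stand-in for the macro: choose the seed or the source and record it as a dependency."""
    if use_synthetic_data:
        node = f"seed.raw_data__{table}"        # mirrors ref('raw_data__<table>')
    else:
        node = f"source.source_input.{table}"   # mirrors source('source_input', <table>)
    deps.add(node)                              # the "DAG edge" that parse-time ref() registers
    return node


def build_order(models, use_synthetic_data):
    """Resolve every model's inputs, then return one valid execution order."""
    graph = {}
    for model, tables in models.items():
        deps = set()
        for table in tables:
            tuva_source(table, use_synthetic_data, deps)
        graph[f"model.{model}"] = deps          # node -> set of predecessors
    return list(TopologicalSorter(graph).static_order())


# Hypothetical bridge models and the input tables they read.
models = {"patient": ["patient"], "encounter": ["encounter", "patient"]}
order = build_order(models, use_synthetic_data=True)
```

Because each model node lists its seed as a predecessor, any order the sorter produces places `seed.raw_data__patient` before `model.patient`. Without the recorded edge, nothing would constrain that order, which matches the intermittent failure described in the PR.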

… into single SELECT via tuva_source()

Every bridge model had two duplicated SELECTs gated by use_synthetic_data. Replaced with one SELECT reading from tuva_source('<table>'). The macro swaps ref() and source() at compile time, so the toggle is invisible to the model. With _sources.yml restored, setting input_database + input_schema now points the same models at the user's own input layer when use_synthetic_data is false. No model edits needed.
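The fallback for input_database / input_schema described in the PR body (fall back to target.database / target.schema) can be sketched as a small Python function. This is an illustrative model only; `resolve_input_location` and the dictionary shapes are hypothetical and are not how dbt renders _sources.yml internally.

```python
def resolve_input_location(project_vars, target):
    """Return (database, schema) for source_input, preferring user-supplied vars.

    project_vars: dict of dbt vars (may omit input_database / input_schema).
    target: dict with the active target's 'database' and 'schema'.
    """
    database = project_vars.get("input_database") or target["database"]
    schema = project_vars.get("input_schema") or target["schema"]
    return database, schema


# Hypothetical target and vars, for illustration only.
target = {"database": "analytics", "schema": "dev"}
default_location = resolve_input_location({}, target)
custom_location = resolve_input_location(
    {"input_database": "raw", "input_schema": "claims"}, target
)
```

With no vars set, the sources resolve to the target's own database and schema; setting both vars redirects the same models to the user's input layer.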

…eprecation

dbt now flags custom top-level config keys; nest the seed batch_size hint under +meta so the deprecation warning goes away.
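As a rough picture of the config change, the move from +batch_size to +meta.batch_size amounts to the dictionary rewrite below. The helper name, the example config, and the batch size value are hypothetical; the real change is a YAML edit in the project config, not Python.

```python
def nest_under_meta(config, custom_keys):
    """Return a copy of config with each custom '+key' moved under '+meta'."""
    migrated = {k: v for k, v in config.items() if k not in custom_keys}
    meta = dict(config.get("+meta", {}))       # keep any existing meta entries
    for key in custom_keys:
        if key in config:
            meta[key.lstrip("+")] = config[key]
    if meta:
        migrated["+meta"] = meta
    return migrated


# Hypothetical seed config before the change.
seed_config = {"+enabled": True, "+batch_size": 10000}
migrated_config = nest_under_meta(seed_config, ["+batch_size"])
```

Keys that dbt already recognizes stay where they are; only the custom hint moves.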

dbt 1.10+ requires generic-test arguments to be nested under arguments (MissingArgumentsPropertyInGenericTestDeprecation). Updated the hcc_recapture staging/intermediate/final yml tests.
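The shape of that YAML change can be sketched as a dictionary rewrite in Python. This is a sketch under assumptions: the set of keys treated as reserved (left at the top level) is a guess for illustration, and the column names are hypothetical.

```python
# Assumption: these keys stay at the top level of a generic test entry.
RESERVED_KEYS = {"name", "description", "config", "arguments"}


def nest_test_arguments(test_body):
    """Return a copy of a generic-test body with non-reserved keys moved under 'arguments'."""
    kept = {k: v for k, v in test_body.items() if k in RESERVED_KEYS}
    arguments = dict(test_body.get("arguments", {}))
    arguments.update({k: v for k, v in test_body.items() if k not in RESERVED_KEYS})
    if arguments:
        kept["arguments"] = arguments
    return kept


# Hypothetical unique_combination_of_columns test body in the old layout.
old_body = {
    "combination_of_columns": ["person_id", "payer"],
    "config": {"severity": "warn"},
}
new_body = nest_test_arguments(old_body)
```

Running the function on a body that is already in the new layout leaves it unchanged, so the rewrite is safe to apply repeatedly.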

netlify · 2026-04-24T03:31:59Z

✅ Deploy Preview for thetuvaproject canceled.

Name	Link
🔨 Latest commit	`a4902f8`
🔍 Latest deploy log	https://app.netlify.com/projects/thetuvaproject/deploys/69eaff9c8255360008579aed

rabee05 added 8 commits April 24, 2026 03:00

fix(config): move +batch_size under +meta to clear CustomKeyInConfigD…

44aea0b

fix(tests): wrap unique_combination_of_columns args under arguments key

1bdc3c3

Merge branch 'tuva-health:main' into main

13e5daf

github-project-automation Bot added this to The Tuva Project Apr 24, 2026

github-project-automation Bot moved this to 👀 Ready for Review in The Tuva Project Apr 24, 2026

Merge branch 'main' into main

a4902f8

rabee05 wants to merge 9 commits into tuva-health:main from rabee05:main
